117 research outputs found

    Exploring the Geography of Tags in Youtube Views

    Although tags play a critical role in many social media, their link to the geographic distribution of user generated videos has been little investigated. In this paper, we analyze the correlation between the geographic distribution of a video's views and the tags attached to this video in a Youtube dataset. We show that tags can be interpreted as markers of a video's geographic diffusion, with some tags strongly linked to well identified geographic areas. Based on our findings, we explore whether the distribution of a video's views can be predicted from its tags. We demonstrate how this predictive power could help improve on-line video services by preferentially storing videos close to where they are likely to be viewed. Our results show that even with a simplistic approach we are able to predict a minimum of 65.9% of a video's views for a majority of videos, and that a tag-based placement strategy can improve the hit rate of a distributed on-line video service by up to 6.8% globally, with an improvement of up to 34% in the USA.
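The idea of tags as geographic markers can be sketched as follows: estimate a video's view distribution by averaging per-tag geographic distributions, then place a replica in the region with the largest predicted share. This is a minimal illustration, not the paper's actual method; the `tag_geo` table and its tag names are invented example data.

```python
from collections import defaultdict

def predict_view_distribution(tags, tag_geo):
    """Average the geographic view distributions of a video's known tags."""
    known = [t for t in tags if t in tag_geo]
    if not known:
        return {}
    combined = defaultdict(float)
    for t in known:
        for region, share in tag_geo[t].items():
            combined[region] += share / len(known)
    return dict(combined)

# Hypothetical per-tag geographic view shares (illustrative only).
tag_geo = {
    "nba":     {"US": 0.8, "FR": 0.1, "IN": 0.1},
    "cricket": {"US": 0.1, "FR": 0.0, "IN": 0.9},
}

dist = predict_view_distribution(["nba", "cricket"], tag_geo)
# A tag-based placement strategy would store the video's replica
# in the region with the largest predicted share.
best_region = max(dist, key=dist.get)
```

With these made-up numbers, India receives the largest predicted share, so a placement strategy would favor a replica there.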

    Simple, Efficient and Convenient Decentralized Multi-Task Learning for Neural Networks

    Artificial intelligence relying on machine learning is increasingly used on small, personal, network-connected devices such as smartphones and vocal assistants, and these applications will likely evolve with the development of the Internet of Things. The learning process requires a lot of data, often real users' data, and computing power. Decentralized machine learning can help to protect users' privacy by keeping sensitive training data on users' devices, and has the potential to alleviate the cost borne by service providers by off-loading some of the learning effort to user devices. Unfortunately, most approaches proposed so far for distributed learning with neural networks are mono-task and do not transfer easily to multi-task problems, in which users seek to solve related but distinct learning tasks; the few existing multi-task approaches have serious limitations. In this paper, we propose a novel learning method for neural networks that is decentralized, multi-task, and keeps users' data local. Our approach works with different learning algorithms and on various types of neural networks. We formally analyze the convergence of our method, and we evaluate its efficiency in different situations on various kinds of neural networks, with different learning algorithms, thus demonstrating its benefits in terms of learning quality and convergence.
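One common way to make such a scheme multi-task while keeping data local is to gossip-average only a shared part of the model and keep each task-specific head private. The sketch below is a simplification under that assumption, not the paper's algorithm; the layer shapes and values are illustrative.

```python
def gossip_round(local, neighbor_copies):
    """One gossip step: element-wise average of the shared layer's weights."""
    copies = [local] + neighbor_copies
    n = len(copies)
    return [sum(ws) / n for ws in zip(*copies)]

# Two hypothetical agents: the shared feature layer is averaged between
# peers, while each task-specific head stays on its owner's device.
shared_a = [1.0, 1.0, 1.0, 1.0]
shared_b = [0.0, 0.0, 0.0, 0.0]
head_a = [0.3, -0.2]        # task A's private output weights, never exchanged
head_b = [0.7, 0.1, -0.5]   # task B's private output weights, never exchanged

new_a = gossip_round(shared_a, [shared_b])
new_b = gossip_round(shared_b, [shared_a])
```

After one round both agents hold the same averaged shared layer, while their differently-shaped task heads remain untouched, which is what lets related but distinct tasks coexist.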

    GOSSIPKIT: A Unified Component Framework for Gossip

    Although the principles of gossip protocols are relatively easy to grasp, their variety can make their design and evaluation highly time consuming. This problem is compounded by the lack of a unified programming framework for gossip, which means developers cannot easily reuse, compose, or adapt existing solutions to fit their needs, and have limited opportunities to share knowledge and ideas. In this paper, we consider how component frameworks, which have been widely applied to implement middleware solutions, can facilitate the development of gossip-based systems in a way that is both generic and simple. We show how such an approach can maximise code reuse, simplify the implementation of gossip protocols, and facilitate dynamic evolution and re-deployment.
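The component idea can be illustrated with a toy decomposition of a gossip protocol into interchangeable parts: a peer-sampling component and a state-merging component, assembled into a protocol object. The class names and decomposition are hypothetical, chosen only to show the composition style, not GOSSIPKIT's actual API.

```python
import random

class PeerSampler:
    """Component that selects gossip partners (here: uniform random choice)."""
    def __init__(self, peers, seed=42):
        self.peers = peers
        self.rng = random.Random(seed)

    def select(self):
        return self.rng.choice(self.peers)

class StateMerger:
    """Component defining how exchanged states combine (here: set union)."""
    def merge(self, local, remote):
        return local | remote

class GossipProtocol:
    """A protocol assembled from interchangeable components."""
    def __init__(self, sampler, merger, state):
        self.sampler, self.merger, self.state = sampler, merger, state

    def round(self, peer_states):
        peer = self.sampler.select()
        self.state = self.merger.merge(self.state, peer_states[peer])
        return self.state

# Swapping in a different sampler or merger yields a different gossip
# protocol without rewriting the round logic.
proto = GossipProtocol(PeerSampler(["p1"]), StateMerger(), {3})
merged = proto.round({"p1": {1, 2}})
```

Reuse comes from the fact that, say, an anti-entropy variant only needs a new `StateMerger`, while the sampling and round-driving code is shared.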

    Good-case Early-Stopping Latency of Synchronous Byzantine Reliable Broadcast: The Deterministic Case (Extended Version)

    This paper considers the good-case latency of Byzantine Reliable Broadcast (BRB), i.e., the time taken by correct processes to deliver a message when the initial sender is correct. This time plays a crucial role in the performance of practical distributed systems. Although significant strides have been made in recent years on this question, progress has mainly focused on either asynchronous or randomized algorithms. By contrast, the good-case latency of deterministic synchronous BRB under a majority of Byzantine faults has been little studied. In particular, it was not known whether a good-case latency below the worst-case bound of t + 1 rounds could be obtained. This work answers this open question positively and proposes a deterministic synchronous Byzantine reliable broadcast that achieves a good-case latency of max(2, t + 3 - c) rounds, where t is the upper bound on the number of Byzantine processes and c the number of effectively correct processes.
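The stated bound is easy to evaluate numerically. As a quick sanity check (with an arbitrarily chosen t = 5): as more processes turn out to be effectively correct, the good-case latency drops from near the worst-case t + 1 = 6 rounds down to the floor of 2 rounds.

```python
def good_case_latency(t, c):
    """Good-case rounds of the proposed synchronous BRB: max(2, t + 3 - c)."""
    return max(2, t + 3 - c)

# With t = 5 tolerated Byzantine faults, latency as the number of
# effectively correct processes c grows from 4 to 8.
latencies = [good_case_latency(5, c) for c in range(4, 9)]
```

For c >= t + 1 the formula bottoms out at the constant floor of 2 rounds, well below the t + 1 worst-case bound.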

    Geo-distribution of Tags and Views in Youtube

    In this article, we analyze the correlation between the geographic distribution of a video's views and the tags of this video within a YouTube dataset. We show that tags can serve as an indicator of a video's geographic diffusion, with some tags very strongly linked to well-defined geographic areas. This correlation can be exploited to correctly predict a minimum of 68% of the views for a majority of videos.

    D.1.2 – Modular quasi-causal data structures

    In large scale systems such as the Internet, replicating data is an essential feature in order to provide availability and fault-tolerance. Attiya and Welch proved that using strong consistency criteria such as atomicity is costly, as each operation may need an execution time linear with the latency of the communication network. Weaker consistency criteria like causal consistency and PRAM consistency do not ensure convergence: the different replicas are not guaranteed to converge towards a unique state. Eventual consistency guarantees that all replicas eventually converge when the participants stop updating. However, it fails to fully specify the semantics of the operations on shared objects and requires additional non-intuitive and error-prone distributed specification techniques. In addition, existing consistency conditions are usually defined independently from the computing entities (nodes) that manipulate the replicated data; i.e., they do not take into account how computing entities might be linked to one another, or geographically distributed. In this deliverable, we address these issues with two novel contributions. The first contribution proposes a notion of proximity graph between computing nodes. If two nodes are connected in this graph, their operations must satisfy a strong consistency condition, while the operations invoked by other nodes are allowed to satisfy a weaker condition. We use this graph to provide a generic approach to the hybridization of data consistency conditions into the same system. Based on this, we design a distributed algorithm based on this proximity graph, which combines sequential consistency and causal consistency (the resulting condition is called fisheye consistency). The second contribution of this deliverable focuses on improving the limitations of eventual consistency.
To this end, we formalize a new consistency criterion, called update consistency, that requires the state of a replicated object to be consistent with a linearization of all the updates. In other words, whereas atomicity imposes a linearization of all of the operations, this criterion imposes this only on updates. Consequently, some read operations may return outdated values. Update consistency is stronger than eventual consistency, so we can replace eventually consistent objects with update consistent ones in any program. Finally, we prove that update consistency is universal, in the sense that any object can be implemented under this criterion in a distributed system where any number of nodes may crash.
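The proximity-graph idea can be sketched with a tiny check that decides which consistency condition a pair of nodes must satisfy: neighbors in the graph get the strong condition, everyone else the weak one. The node names and region layout below are invented for illustration.

```python
class ProximityGraph:
    """Pairs of nodes connected in the graph must satisfy the strong
    condition; all other pairs only the weak one (fisheye consistency)."""
    def __init__(self, edges):
        self.edges = {frozenset(e) for e in edges}

    def required_condition(self, a, b):
        if frozenset((a, b)) in self.edges:
            return "sequential"   # strong: neighbors in the graph
        return "causal"           # weak: everyone else

# Hypothetical deployment: nodes in the same data center are neighbors.
g = ProximityGraph([("paris-1", "paris-2"), ("nyc-1", "nyc-2")])
cond_near = g.required_condition("paris-1", "paris-2")
cond_far = g.required_condition("paris-1", "nyc-1")
```

Nearby nodes thus pay the cost of sequential consistency only among themselves, while cross-region operations settle for causal consistency, which is the hybridization the deliverable describes.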

    Fisheye consistency: maintaining data synchronization in a geo-replicated world

    Over the last thirty years, numerous consistency conditions for replicated data have been proposed and implemented. Popular examples of such conditions include linearizability (or atomicity), sequential consistency, causal consistency, and eventual consistency. These consistency conditions are usually defined independently from the computing entities (nodes) that manipulate the replicated data; i.e., they do not take into account how computing entities might be linked to one another, or geographically distributed. To address this gap, as a first contribution, this paper introduces the notion of a proximity graph between computing nodes. If two nodes are connected in this graph, their operations must satisfy a strong consistency condition, while the operations invoked by other nodes are allowed to satisfy a weaker condition. The second contribution is the use of such a graph to provide a generic approach to the hybridization of data consistency conditions within the same system. We illustrate this approach on sequential consistency and causal consistency, and present a model in which all data operations are causally consistent, while operations by neighboring processes in the proximity graph are sequentially consistent. The third contribution of the paper is the design and proof of a distributed algorithm based on this proximity graph, which combines sequential consistency and causal consistency (the resulting condition is called fisheye consistency). In doing so, the paper not only extends the domain of consistency conditions, but also provides a generic, provably correct solution of direct relevance to modern geo-replicated systems.

    Context Adaptive Cooperation

    Reliable broadcast and consensus are the two pillars that support many non-trivial fault-tolerant distributed middleware and fault-tolerant distributed systems. While their definitions are close, they strongly differ in the underlying assumptions needed to implement each of them. Reliable broadcast can be implemented in asynchronous systems in the presence of crash or Byzantine failures, while consensus cannot. This key difference stems from the fact that consensus involves synchronization between multiple processes that concurrently propose values, while reliable broadcast simply involves delivering a message from a predefined sender. This paper strikes a balance between these two agreement abstractions in the presence of Byzantine failures. It proposes CAC, a novel agreement abstraction that enables multiple processes to broadcast messages simultaneously, while guaranteeing that (despite potential conflicts, asynchrony, and Byzantine behaviors) the non-faulty processes will agree on message deliveries. We show that this novel abstraction can enable more efficient algorithms for a variety of applications (such as money transfer, where several people can share the same account). This is obtained by focusing the need for synchronization only on the processes that actually need to synchronize.
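The money-transfer example can be made concrete with a minimal conflict test: two concurrently broadcast transfers only need synchronization when they debit the same account. This is a simplified reading of the abstract's claim, not the CAC protocol itself; the transfer records are illustrative.

```python
def conflicting(op1, op2):
    """Two concurrently broadcast transfers conflict iff they debit
    the same account; only then must the involved processes synchronize."""
    return op1["src"] == op2["src"]

# Hypothetical transfers: two debits from a shared account vs. an
# independent one from a different account.
t1 = {"src": "alice", "dst": "bob",   "amount": 5}
t2 = {"src": "alice", "dst": "carol", "amount": 7}
t3 = {"src": "dave",  "dst": "bob",   "amount": 2}

needs_sync = conflicting(t1, t2)   # same debited account: must be ordered
no_sync = conflicting(t1, t3)      # independent accounts: deliver freely
```

Non-conflicting transfers can be delivered without consensus-style coordination, which is where the efficiency gain over plain consensus comes from.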
